A Probabilistic NF2 Relational Algebra for Integrated Information Retrieval and Database Systems
Abstract
The integration of information retrieval (IR) and database systems requires a data model that allows for modelling documents as entities, representing uncertainty and vagueness, and performing uncertain inference. For this purpose, we present a probabilistic data model based on relations in non-first-normal-form (NF2). Here, tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. Thus, the set of weighted index terms of a document is represented as a probabilistic subrelation. In a similar way, imprecise attribute values are modelled as a set-valued attribute. We redefine the relational operators for this type of relation such that the result of each operator is again a probabilistic NF2 relation, where the weight of a tuple gives the probability that this tuple belongs to the result. By ordering the tuples according to decreasing probabilities, the model yields a ranking of answers, as in most IR models. This effect can also be used for typical database queries involving imprecise attribute values as well as for combinations of database and IR queries.
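The idea of weighted tuples and ranked results can be illustrated with a minimal Python sketch. It is not the algebra defined in the paper: the relation layout, the names `DOCS`, `select_terms` and `rank`, the sample weights, and above all the assumption that term weights are independent (so that a conjunction multiplies them) are all hypothetical choices made for illustration, since the abstract does not state how weights are combined.

```python
from math import prod

# Hypothetical NF2-style relation: each document tuple carries a nested
# subrelation "index" that maps a term to the probability that the term
# belongs to the document's index. All values are made up.
DOCS = [
    {"docno": 1, "year": 1996, "index": {"ir": 0.9, "db": 0.6}},
    {"docno": 2, "year": 1994, "index": {"ir": 0.5, "db": 0.8}},
    {"docno": 3, "year": 1996, "index": {"ir": 0.7}},
]


def select_terms(docs, terms, predicate=lambda doc: True):
    """Return (weight, docno) pairs for documents indexed with all terms.

    The weight is the product of the term probabilities, which is only
    valid under the independence assumption made for this sketch.
    `predicate` is an ordinary (certain) database condition.
    """
    result = []
    for doc in docs:
        if not predicate(doc):
            continue
        weight = prod(doc["index"].get(term, 0.0) for term in terms)
        if weight > 0.0:
            result.append((weight, doc["docno"]))
    return result


def rank(weighted):
    """Order result tuples by decreasing probability."""
    return sorted(weighted, key=lambda pair: pair[0], reverse=True)


# Pure IR query: documents about both "ir" and "db".
ranking = rank(select_terms(DOCS, ["ir", "db"]))

# Combined database and IR query: documents from 1996 about "ir".
ranking_1996 = rank(select_terms(DOCS, ["ir"], lambda d: d["year"] == 1996))
```

Under these assumptions, document 1 ranks above document 2 for the first query, and document 3 drops out because it has no weight for "db". The second query shows how a certain attribute condition and an uncertain index condition can be evaluated together.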
Similar resources
A Probabilistic NF2 Relational Algebra for Imprecision in Databases
We present a probabilistic data model which is based on relations in non-first-normal-form (NF2). Here, tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. This way, imprecise attribute values are modelled as a probabilistic subrelation. For information retrieval, the set of weighted index terms of a document can be represented in the same way, thu...
Data Structures for an Integrated Data Base Management and Information Retrieval System
New applications like office information systems need interfaces to data bases which integrate classical data manipulation with management and retrieval of textual (“unformatted”) data. The relational data model is widely accepted as a high level interface to classical (“formatted”) data management. It turns out, however, to be inconvenient for handling even simple data structures as commonly u...
Models for Integrated Information Retrieval and Database Systems
In this paper, we show that there is a mismatch between information retrieval (IR) and database (DB) concepts, and we devise solutions for this problem. DB oriented approaches have to distinguish between the logical and the content structure of objects, and should also consider the layout structure. Data independence—not regarded in IR before—can be achieved by using the notion of vague predica...
Logical and Conceptual Models for the Integration of Information Retrieval and Database Systems
We present two new approaches to the problem of integrating information retrieval (IR) and database (DB) systems. On the logical level, IR is based on uncertain inference, which is a generalization to the certain inference process employed in DB systems. As an implementation of this concept, we present a probabilistic relational algebra. On the conceptual level, we distinguish between the logic...
Bridging Information Retrieval and Databases
For bridging the gap between information retrieval (IR) and databases (DB), this article focuses on the logical view. We claim that IR should adopt three major concepts from DB, namely inference, vague predicates and expressive query languages. By regarding IR as uncertain inference, probabilistic versions of relational algebra and Datalog yield very powerful inference mechanisms for IR as well...
Publication date: 1996